Python: Checking and Comparing np.nan and None


None

type(None)
# Output: <class 'NoneType'>  (Python 2 prints <type 'NoneType'>)

None is None
# Output: True

None == None
# Output: True
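
Both checks above return True, but `is None` is the usual idiom because `==` can be overridden by a class. A minimal sketch of the difference follows; the class `AlwaysEqual` is hypothetical and exists only for this illustration.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# == dispatches to AlwaysEqual.__eq__, so the result is misleading.
equal_to_none = (obj == None)  # noqa: E711

# `is` compares object identity and cannot be overridden.
is_none = obj is None

print(equal_to_none, is_none)
```

Here `obj == None` reports True although `obj` is clearly not None, while `obj is None` stays False.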

np.nan

import numpy as np

type(np.nan)
# Output: <class 'float'>  (Python 2 prints <type 'float'>)

np.nan == np.nan
# Output: False

np.nan is np.nan
# Output: True

np.nan != np.nan
# Output: True

np.nan > np.nan
# Output: False

np.nan < np.nan
# Output: False
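
Since `np.nan == np.nan` is False, equality cannot be used to detect NaN, and `np.nan is np.nan` only works because both names refer to the same object. The sketch below shows a few checks that do work; the variable names are illustrative only.

```python
import math

import numpy as np

value = np.nan

# Equality can never detect NaN, because NaN == NaN is False.
found_by_equality = (value == np.nan)

# Dedicated predicates are the reliable way to test for NaN.
found_by_math = math.isnan(value)
found_by_numpy = bool(np.isnan(value))

# NaN is the only float that is not equal to itself.
found_by_self_compare = (value != value)

# Identity is fragile: a NaN created elsewhere is a different object.
other_nan = float("nan")
same_object = other_nan is np.nan

# np.isnan also works element-wise on arrays.
mask = np.isnan(np.array([1.0, np.nan, 3.0]))

print(found_by_equality, found_by_math, found_by_numpy, found_by_self_compare)
print(same_object, mask.tolist())
```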

np.nan and None

None == np.nan
# Output: False

None != np.nan
# Output: True

None is np.nan
# Output: False
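
Although None and np.nan are distinct objects that never compare equal, pandas treats both as missing values. The following sketch uses `pd.isna` to test for either one at once; the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

candidates = [None, np.nan, 0, "", 10]

# pd.isna returns True for both None and NaN, and False for ordinary values.
missing_flags = [bool(pd.isna(item)) for item in candidates]
print(missing_flags)

# In a float Series, None is stored as NaN, so the distinction disappears.
series = pd.Series([1.0, None, np.nan])
print(series.isna().tolist())
```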

Comparison with numbers

np.nan > 10
# Output: False

np.nan < 10
# Output: False

np.nan == 10
# Output: False
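
One practical consequence of the results above is that "not greater than" no longer implies "less than or equal" once NaN is involved. The sketch below uses `float("nan")`, which is the same kind of value as np.nan, and a made-up list of readings.

```python
nan = float("nan")  # same kind of value as np.nan, which is a plain Python float
threshold = 10

# Both orderings are False, so "not greater" does not imply "less or equal".
not_greater = not (nan > threshold)
less_or_equal = nan <= threshold

# A filter based on > silently drops NaN entries...
readings = [3.0, nan, 12.0, 25.0]
above = [x for x in readings if x > threshold]

# ...and so does the complementary filter, so the two parts do not add up.
at_or_below = [x for x in readings if x <= threshold]

print(not_greater, less_or_equal)
print(above, at_or_below)
```

The two filtered lists together contain three items, while the input has four: the NaN entry falls into neither group.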

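Summary

  • [Same in Python 2 and Python 3]
    With np.nan, only != (for example np.nan != np.nan) and the identity check np.nan is np.nan return True; ==, <, > and the other ordering comparisons against any number, including np.nan itself, return False.
  • [Different in Python 2 and Python 3]
    Using np.nan directly behaves the same in Python 2 and Python 3, but NaN inside a DataFrame behaves differently.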

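Special case

When NaN values in a DataFrame are compared with a number, the result is sometimes True and sometimes False.

  • This happens in Python 3 when NaN is compared with a number:
    • if the column's dtype is numeric, the result is False;
    • otherwise the result is True.
  • In Python 2, comparing a DataFrame NaN with a number gives the same result as comparing np.nan with a number: always False.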
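
Because the result depends on the column dtype, one cautious approach is to check the dtypes first and keep an explicit mask of missing cells. The sketch below rebuilds the same frame used in this article's code example; the names `numeric`, `greater`, and `missing` are illustrative, and this is only one possible way to handle the issue.

```python
import pandas as pd

# Same frame as in the article's example: reindexing adds all-NaN rows.
df = pd.DataFrame(
    [["a", 2, 3], ["a", 3, 4], ["a", 8, 9]], index=["a", "b", "c"]
).reindex(["a", "b", "c", "d", "e", "f"])

# Inspect dtypes first: column 0 holds strings, columns 1 and 2 are float64.
print(df.dtypes)

# Restrict the comparison to numeric columns, where NaN > 5 is False.
numeric = df.select_dtypes(include="number")
greater = numeric > 5

# Keep a separate mask of missing cells so NaN is never mistaken for "not > 5".
missing = numeric.isna()

print(greater)
print(missing)
```

Keeping `missing` alongside `greater` makes it possible to tell "value is 5 or less" apart from "value is absent", regardless of how a given environment treats NaN in non-numeric columns.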

Python 2 vs. Python 3

  • Code example:
# Python 3:
import pandas as pd

# Column 0 holds strings; columns 1 and 2 hold integers.
df = pd.DataFrame([['a', 2, 3], ['a', 3, 4], ['a', 8, 9]], index=['a', 'b', 'c'])
# Reindexing adds rows d, e, f, which are filled with NaN.
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f'])
print(df)
# Output:
#      0    1    2
# a    a  2.0  3.0
# b    a  3.0  4.0
# c    a  8.0  9.0
# d  NaN  NaN  NaN
# e  NaN  NaN  NaN
# f  NaN  NaN  NaN

print(df > 5)
# Output (NaN in the string column 0 compares as True):
#       0      1      2
# a  True  False  False
# b  True  False  False
# c  True   True   True
# d  True  False  False
# e  True  False  False
# f  True  False  False

print(df.values)
# Output:
# [['a' 2.0 3.0]
#  ['a' 3.0 4.0]
#  ['a' 8.0 9.0]
#  [nan nan nan]
#  [nan nan nan]
#  [nan nan nan]]

# Python 2:
import pandas as pd

# Column 0 holds strings; columns 1 and 2 hold integers.
df = pd.DataFrame([['a', 2, 3], ['a', 3, 4], ['a', 8, 9]], index=['a', 'b', 'c'])
# Reindexing adds rows d, e, f, which are filled with NaN.
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f'])
print(df)
# Output:
#      0    1    2
# a    a  2.0  3.0
# b    a  3.0  4.0
# c    a  8.0  9.0
# d  NaN  NaN  NaN
# e  NaN  NaN  NaN
# f  NaN  NaN  NaN

print(df > 5)
# Output (NaN compares as False in every column):
#        0      1      2
# a   True  False  False
# b   True  False  False
# c   True   True   True
# d  False  False  False
# e  False  False  False
# f  False  False  False

print(df.values)
# Output:
# [['a' 2.0 3.0]
#  ['a' 3.0 4.0]
#  ['a' 8.0 9.0]
#  [nan nan nan]
#  [nan nan nan]
#  [nan nan nan]]